Item-by-item results
The following section presents the item-by-item results of the analysis. Each item has several tables and a
figure. The figure, called a quantile plot, shows the proportion of examinees selecting each option, for
consecutive segments of the examinees as ranked by score. The key thing to evaluate in this figure is that
the line for the correct answer has a positive slope (goes up from left to right), which means that examinees
with higher scores tend to answer correctly more often. Conversely, the lines for the incorrect options, called
distractors, should have a negative slope. Note, however, that the use of a small number of groups (e.g., 3
or fewer) oversimplifies the graph, so that items which are very difficult or very easy (that is, discriminating in
only the top or bottom 20% of examinees) might appear to have poor quantile plots and classical statistics.
For such items, item response theory presents significant advantages in analysis.
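The calculation behind a quantile plot can be sketched in a few lines of Python. This is an illustrative sketch only, not Iteman's own code: the function name quantile_plot_data, its arguments, and the way tied scores are split between groups are all assumptions made for the example.

```python
from collections import Counter

def quantile_plot_data(responses, scores, options, n_groups=3):
    """Proportion of examinees choosing each option within score-ranked groups.

    Hypothetical helper: ties in score are broken by input order, which is
    an assumption and may differ from what any particular program does.
    """
    n = len(scores)
    if n < n_groups:
        raise ValueError("need at least one examinee per group")
    # Rank examinees by total score, lowest first.
    order = sorted(range(n), key=lambda i: scores[i])
    table = []
    for g in range(n_groups):
        # Consecutive, near-equal segments of the ranked examinees.
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        counts = Counter(responses[i] for i in members)
        table.append({opt: counts[opt] / len(members) for opt in options})
    return table

# Made-up data: six examinees, key is "A".
responses = ["B", "C", "A", "B", "A", "A"]
scores = [1, 2, 3, 4, 5, 6]
table = quantile_plot_data(responses, scores, options="ABC", n_groups=3)
key_line = [group["A"] for group in table]  # line for the correct answer
```

In this made-up data the proportion choosing the key rises from the lowest to the highest score group, which is the positive slope described above.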
There are four tables presented for each item.
1. Item information table: records the information supplied by the control file (or Iteman 3 header) for this
item.
2. Item statistics table: overall item statistics.
3. Option statistics: detailed statistics for each option, which help diagnose issues in items with poor
statistics.
4. Quantile plot data: the values used to create the quantile plot.
The item statistics table presents overall item statistics in the first row of numbers. The two most important
item-level statistics for dichotomously scored (correct/incorrect) items are the P value and the point-biserial
correlation, which represent the difficulty and discrimination of the item, respectively. For polytomously
scored (rating scale or partial credit) items, the difficulty is represented by the mean (average) item score,
while the discrimination is represented by a Pearson r correlation.
The P value is the proportion of examinees that answered an item in the keyed direction. P ranges from 0 to
1. A high value (e.g., 0.95) means that an item is easy; a low value (e.g., 0.25) means that the item is difficult. The
point-biserial correlation (Rpbis) is a measure of the discriminating, or differentiating, power of the item.
Rpbis ranges from -1 to 1. A negative Rpbis indicates a bad item, because lower scoring examinees are more
likely than higher scoring examinees to respond in the keyed direction.
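As a rough illustration of these two statistics, the sketch below computes a P value and a point-biserial correlation from scored (0/1) item responses and total scores. The function names are hypothetical, and the sketch correlates the item with the total score as given; whether the item's own score is removed from the total first is not stated here, so that choice is an assumption.

```python
import math

def p_value(item_scores):
    """Proportion of examinees responding in the keyed direction (1 = keyed)."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total score."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    # Undefined if everyone (or no one) answered in the keyed direction,
    # or if all total scores are equal; this sketch does not handle that.
    return cov / math.sqrt(var_x * var_y)

# Made-up data: the two highest scorers answered in the keyed direction.
item = [0, 0, 1, 1]
total = [1, 2, 3, 4]
p = p_value(item)                         # half the examinees
r_good = point_biserial(item, total)      # positive: item discriminates
r_bad = point_biserial([1, 1, 0, 0], total)  # negative: reversed pattern
```

The second call reverses the response pattern so that only the lowest scorers respond in the keyed direction, which produces the negative Rpbis described above.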
For rating scale or partial credit items, the mean item score ranges from the minimum to the maximum of the
scale. For example, if the item has a rating scale of 1 to 5, the possible range for the mean is 1 to 5. The
Pearson r is similar to the Rpbis in that it ranges from -1 to 1, with a positive r indicating that examinees
with higher item scores tend to have higher total scores.
The option statistics table presents statistics for each individual option (alternative). The key thing to
examine in this portion of the table is that no distractor has a higher Rpbis than the correct answer. A
distractor with a higher Rpbis than the key is being selected by higher scoring examinees, which suggests that
it might be arguably correct.
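The check described above can be sketched as follows. This is an assumed, simplified calculation rather than Iteman's implementation: each option's Rpbis is taken to be the correlation between a 0/1 "selected this option" indicator and the total score, and the names option_statistics and suspicious_distractors are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns NaN when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def option_statistics(responses, total_scores, options):
    """Proportion selecting each option and that option's Rpbis."""
    stats = {}
    for opt in options:
        chosen = [1 if r == opt else 0 for r in responses]
        stats[opt] = {
            "proportion": sum(chosen) / len(chosen),
            "rpbis": pearson(chosen, total_scores),
        }
    return stats

def suspicious_distractors(stats, key):
    """Distractors whose Rpbis exceeds that of the keyed option."""
    key_r = stats[key]["rpbis"]
    return [opt for opt, s in stats.items()
            if opt != key and s["rpbis"] > key_r]

# Made-up data: high scorers mostly chose "B".
responses = ["A", "A", "B", "B", "C", "B"]
totals = [1, 2, 5, 6, 3, 4]
stats = option_statistics(responses, totals, options="ABC")
flagged_if_key_a = suspicious_distractors(stats, key="A")
flagged_if_key_b = suspicious_distractors(stats, key="B")
```

With "A" as the key, both distractors are flagged because the examinees choosing "A" are the lowest scorers; with "B" as the key, nothing is flagged. A pattern like the first is what would prompt a review of the key.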
The quantile plot data table simply presents the values calculated to create the quantile plot. The quantile
plot itself presents a useful picture of the item's performance; this table contains the same information in
numeric form and can be used to examine that performance in detail to help diagnose possible issues.